In this paper, we study the role non-adaptivity plays in maintaining dynamic data structures.
Roughly speaking, a data structure is non-adaptive if the memory locations it reads
and/or writes when processing a query or update depend only on the query or update and 
not on the contents of previously read cells. We study such non-adaptive data structures 
in the cell probe model. This model is one of the least restrictive lower bound models and 
in particular, cell probe lower bounds apply to data structures developed in the popular 
word-RAM model. Unfortunately, this generality comes at a high cost: the highest lower 
bound proved for any data structure problem is only polylogarithmic. Our main result is to
demonstrate that one can in fact obtain polynomial cell probe lower bounds for non-adaptive
data structures. 

To shed more light on the seemingly inherent polylogarithmic lower bound barrier, we 
study several different notions of non-adaptivity and identify key properties that must be 
dealt with if we are to prove polynomial lower bounds without restrictions on the data 
structures.

Finally, our results also unveil an interesting connection between data structures and
depth-2 circuits. This allows us to translate conjectured hard data structure problems into 
good candidates for high circuit lower bounds; in particular, in the area of linear circuits 
for linear operators. Building on lower bound proofs for data structures in slightly more 
restrictive models, we also present a number of properties of linear operators which we 
believe are worth investigating in the realm of circuit lower bounds. 
1 Introduction



Proving lower bounds on the performance of data structures has been an important line of 
research for decades. Over time, numerous computational models have been proposed, of which 
the cell probe model of Yao [21] is the least restrictive. Lower bounds proved in this model apply
to essentially any imaginable data structure, including those developed in the most popular 
upper bound model, the word-RAM. Much effort has therefore been spent on deriving cell 
probe lower bounds for natural data structure problems. Nevertheless, the highest lower bound 
that has been proved for any data structure problem remains just polylogarithmic. 

In this paper, we consider a natural restriction of data structures, namely non-adaptivity. 
Roughly speaking, a non-adaptive data structure is a data structure for which the memory 
locations read when answering a query or processing an update depend only on the query or 
update itself, and not on the contents of the previously read memory locations. Surprisingly, 
we are able to derive polynomially high cell probe lower bounds for such data structures. 

1.1 The Cell Probe Model 

In the cell probe model, a data structure consists of a collection of memory cells, each storing 
w bits. Each cell has an integer address amongst $[2^w] = \{1, \ldots, 2^w\}$, i.e. we assume any cell
has enough bits to address any other cell. When a data structure is presented with a query, the 
query algorithm starts reading, or probing, cells of the memory. The cell probed at each step 
may depend arbitrarily on the query and the contents of all cells probed so far. After probing 
a number of cells, the query algorithm terminates with the answer to the query. 

A dynamic data structure in the cell probe model must also support updates. When pre- 
sented with an update, the update algorithm similarly starts reading and/or writing cells of the 
data structure. We refer jointly to reading or writing a cell as probing the cell. The cell probed
at each step, and the contents written to a cell at each step, may again depend arbitrarily on 
the update operation and the cells probed so far. 

The query and update times of a cell probe data structure are defined as the number of cells 
probed when answering a query or update respectively. The space usage is simply defined as 
the largest address used by any cell of the data structure. 
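
As an illustration of these definitions, the following is a minimal sketch of a memory that charges one unit per probe. The class `CellProbeMemory` and the helper `adaptive_lookup` are hypothetical names introduced here, not part of the model's definition. The helper shows what adaptivity means: the address of the second probe is read out of the first cell.

```python
class CellProbeMemory:
    """Memory of w-bit cells that counts every probe (read or write)."""

    def __init__(self, w):
        self.w = w            # cell size in bits
        self.cells = {}       # address -> contents; untouched cells read as 0
        self.probes = 0       # number of cells probed so far

    def read(self, addr):
        self.probes += 1
        return self.cells.get(addr, 0)

    def write(self, addr, value):
        assert 0 <= value < 2 ** self.w, "contents must fit in w bits"
        self.probes += 1
        self.cells[addr] = value

    def space(self):
        # Space usage: the largest address used by any cell.
        return max(self.cells, default=0)


def adaptive_lookup(mem, start):
    """Follow one pointer: the second probe depends on the first cell's contents."""
    pointer = mem.read(start)
    return mem.read(pointer)


mem = CellProbeMemory(w=8)
mem.write(1, 5)       # cell 1 points at cell 5
mem.write(5, 42)
mem.probes = 0        # count only the probes made by the query itself
answer = adaptive_lookup(mem, 1)
```

Only probes are charged in this sketch; any computation performed between probes is free, which is what makes the model so permissive.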

1.2 Previous Cell Probe Lower Bound Techniques 

As mentioned, the state-of-the-art techniques for proving cell probe lower bounds unfortunately 
yield just polylogarithmic bounds. In the following, we give a brief overview of the highest lower 
bounds that have been achieved since the introduction of the model, and also the most promising
line of attack towards polynomial lower bounds. 

Static Data Structures. One of the most important early papers on cell probe lower bounds 
for static data structures is the paper of Miltersen et al. [15]. They demonstrated an elegant
reduction to data structures from an asymmetric communication game. This connection allowed
them to obtain lower bounds of the form $t_q = \Omega(\lg m / \lg S)$, where m denotes the number of
queries to the data structure problem, S the space usage in number of cells and $t_q$ the query
time. Note however that this bound is insensitive to polynomial changes in S and cannot give
super-constant lower bounds for problems where the number of possible queries is just polynomial
in the input size (which is true for most natural problems). This barrier was overcome
in the seminal work of Pătraşcu and Thorup [19], who extended the communication game of
Miltersen et al. [15] and obtained lower bounds of $t_q = \Omega(\lg m / \lg(S t_q / n))$, which peaks at
$t_q = \Omega(\lg m / \lg \lg m)$ for data structures using $n \, \mathrm{poly}(\lg m)$ space.
An alternative approach to static lower bounds was given by Panigrahy et al. [16]. Their
method is based on sampling the cells of a data structure and showing that many queries can be 
answered from a small set of cells if the query time is too small (we note that similar ideas have 
been used for succinct data structure lower bounds, see e.g. [9]). The maximum lower bounds 
that can be obtained from this technique are of the form $t_q = \Omega(\lg m / \lg(S/n))$, see [13]. For
linear space, this reaches $t_q = \Omega(\lg m)$, which remains the highest static lower bound to date.

Dynamic Data Structures. The first technique for proving lower bounds on dynamic data 
structures was the chronogram technique of Fredman and Saks [7]. This technique gives lower 
bounds of the form $t_q = \Omega(\lg n / \lg(w t_u))$ and plays a fundamental role in all later techniques
for proving dynamic data structure lower bounds. Pătraşcu and Demaine [18] extended the
technique of Fredman and Saks with their information transfer technique. This extension
allowed for lower bounds of $\max\{t_q, t_u\} = \Omega(\lg n)$. Very recently, Larsen [12] combined the
chronogram technique of Fredman and Saks with the cell sampling method of Panigrahy et al.
to obtain a lower bound of $t_q = \Omega((\lg n / \lg(w t_u))^2)$, which remains the highest lower bound
achieved so far. 

Conditional Lower Bounds. Examining all of the above results, we observe that no lower 
bound has yet exceeded $\max\{t_u, t_q\} = \Omega((\lg n / \lg \lg n)^2)$ in the most natural case of polynomially
many queries, i.e. $m = \mathrm{poly}(n)$. In an attempt to overcome this barrier, Pătraşcu [17] defined
a dynamic version of a set disjointness problem, named the multiphase problem. We study 
problems that are closely related to the multiphase problem, so we summarize it here: 

The Multiphase Problem. This problem consists of three phases: 

• Phase I: In this phase, we receive k sets $S_1, \ldots, S_k$, all subsets of a universe [n]. We are
allowed to preprocess these sets into a data structure using time $O(\tau k n)$.

• Phase II: We receive another set $T \subseteq [n]$ and have time $O(\tau n)$ to read and update cells
of the data structure constructed in Phase I.

• Phase III: We receive an index $i \in [k]$ and have time $O(\tau)$ to read cells of the data
structure constructed during Phase I and II in order to determine whether $S_i \cap T = \emptyset$.

Pătraşcu conjectured that there exist constants $\mu > 1$ and $\varepsilon > 0$ such that any solution
for the multiphase problem must have $\tau = \Omega(n^{\varepsilon})$ when $k = n^{\mu}$, i.e. for the right relationship
between n and k, any data structure must have either polynomial preprocessing time, update
time or query time. Furthermore, he reduced the multiphase problem to a number of natural
data structure problems, including e.g. the following problems.

• Reachability in Directed Graphs. In a preprocessing phase, we are given a directed 
graph with n nodes and m edges. We are then to support inserting directed edges into 
the graph. A query is finally specified by two nodes of the graph, u and v, and the goal 
is to determine whether there exists a directed path from u to v. 

• Subgraph Connectivity. In a preprocessing phase, we are given an undirected graph 
with n nodes and m edges. We are then to turn nodes on and off. A query is finally 
specified by two nodes of the graph, u and v, and the goal is to determine whether there
exists a path from u to v using only on nodes. 

We also mention the following problem, which was shown in [2] to solve the multiphase problem. 
• Range Mode. In a preprocessing phase, we are given an array $A[1 : n] = \{A[1], \ldots, A[n]\}$
of integers and are to support value updates $A[i] \leftarrow A[i] + x$. Queries are specified by
two indices i and j, and the goal is to find the most frequently occurring integer in the
subarray $A[i : j]$.

These reductions imply polynomial lower bounds for the above problems, if the multiphase 
problem has a polynomial lower bound. Thus it seems fair to say that studying the multiphase 
problem is the most promising direction for obtaining polynomial data structure lower bounds. 
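
To make the three phases of the multiphase problem concrete, here is a naive baseline, given as a sketch only. The function names are hypothetical, elements are numbered from 0 rather than 1, and each set is stored as a Python integer used as a bitmap. With w-bit cells, the Phase III check in this baseline would touch about n/w cells, which is polynomial in n; the conjecture asserts that some polynomial cost is unavoidable for the right relationship between n and k.

```python
def phase_one(sets, n):
    """Phase I: store each S_i as an n-bit bitmap."""
    assert all(0 <= x < n for s in sets for x in s)
    return [sum(1 << x for x in s) for s in sets]


def phase_two(bitmaps, T):
    """Phase II: store T as a bitmap next to the preprocessed sets."""
    return bitmaps, sum(1 << x for x in T)


def phase_three(structure, i):
    """Phase III: report whether S_i and T are disjoint."""
    bitmaps, t_bitmap = structure
    return (bitmaps[i] & t_bitmap) == 0
```

For example, with sets `[{0, 1}, {2}]` over a universe of size 4 and `T = {1, 3}`, the first set intersects T and the second does not.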

1.3 Non-Adaptivity 

Given that we are generally clueless about how to prove polynomial lower bounds in the cell 
probe model, it is natural to investigate under which circumstances such bounds can be achieved. 
In this paper, we study the performance of data structures that are non-adaptive. To make the 
notion of non-adaptivity precise, we define it in the following: 

• Non-Adaptive Query Algorithm. A cell probe data structure has a non-adaptive 
query algorithm, if the cells it probes when answering a query depend only on the query, 
and not on the contents of previously probed cells. 

• Non-Adaptive Update Algorithm. Similarly, a cell probe data structure has a non-adaptive
update algorithm, if the cells it probes when processing an update depend only
on the update, and not on the contents of previously probed cells.

• Memoryless Update Algorithm. In this paper, we also study a slightly more restrictive
type of update algorithm. A cell probe data structure has a memoryless update algorithm,
if the update algorithm is non-adaptive and, furthermore, the contents written to a
cell during an update depend only on the update and the current contents of the cell, i.e.,
they may not depend on the contents of other cells probed during the update operation (see footnote 1).

• Linear Data Structures. Finally, we study a sub-class of the data structures with a 
memoryless update algorithm, which we refer to as linear data structures. These data 
structures are defined for problems where the input can be interpreted as an array A of n 
bits and an update operation can be interpreted as flipping a bit of A (from 0 to 1 or 1 to
0). A linear data structure has non-adaptive query and update algorithms. Furthermore, 
when processing an update, the contents of all probed cells are simply flipped, and on a 
query, the data structure returns the XOR of the bits stored in all the probed cells. Note 
that these data structures use only a word size of w = 1 bit, every cell stores a linear 
combination over the bits of A (mod 2) and a query again computes a linear combination 
over the stored linear combinations (mod 2). 

1 A caveat on the semantics of updates: in this work, we assume updates specify how data changes (e.g.
updates are of the form $A[k] \leftarrow A[k] + \Delta$) as opposed to specifying new values for data (e.g. updates of the form
$A[k] \leftarrow v$). The latter notion goes against the notion of non-adaptive updates, since to rewrite a cell, one must
know how an update changes data. One solution is to assume that the data structure stores raw data directly,
and to allow memoryless updates to depend on the current contents of a cell, the update, and the previous value
of the update. We view this issue as largely semantic, and do not discuss it further.

While linear data structures might appear to be severely restrictive, for many data structure
problems (particularly in the area of range searching), natural solutions are in fact linear. An
example is the well-studied prefix sum problem, where the goal is to dynamically maintain an
array A of bits under flip operations, and a query asks for the XOR of elements in a prefix range
$A[1 \ldots k]$. One-dimensional range trees are linear data structures that solve prefix sum with
update and query time $O(\lg n)$. This is optimal when memory cells store only single bits [18],
even for adaptive data structures. More elaborate problems in range searching would be: Given
a fixed set P of n points in d-dimensional space, support deleting and re-inserting points of P
while answering queries of the form "what is the parity of the number of points inside a given
query range?". Here query ranges could be axis-aligned rectangles, halfspaces, simplices etc.
We note that all the known data structures for range counting can easily be modified to yield
linear data structures when given a fixed set of points P, and still, this setting seems to capture
the hardness of range counting.
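
The prefix sum example can be sketched in code. The version below uses a Fenwick tree (binary indexed tree) as a compact stand-in for the one-dimensional range tree mentioned above; this substitution and the class name `PrefixXor` are choices made for this illustration. Both probe sets depend only on the operation's index, updates flip every probed cell, and queries return the XOR of the probed cells, so the structure is linear in the sense defined above.

```python
class PrefixXor:
    """Linear data structure for prefix XOR over an n-bit array A (1-indexed)."""

    def __init__(self, n):
        self.n = n
        self.cell = [0] * (n + 1)   # one bit per cell, i.e. w = 1

    def update_cells(self, i):
        """Cells probed when flipping A[i]; depends only on i."""
        out = []
        while i <= self.n:
            out.append(i)
            i += i & -i             # move to the next covering cell
        return out

    def query_cells(self, k):
        """Cells probed when asking for A[1] xor ... xor A[k]; depends only on k."""
        out = []
        while k > 0:
            out.append(k)
            k -= k & -k             # strip the lowest set bit
        return out

    def flip(self, i):
        for c in self.update_cells(i):
            self.cell[c] ^= 1       # the contents of every probed cell are flipped

    def prefix(self, k):
        ans = 0
        for c in self.query_cells(k):
            ans ^= self.cell[c]     # the answer is the XOR of the probed cells
        return ans
```

Each probe set has at most $\lfloor \lg n \rfloor + 1$ cells, in line with the $O(\lg n)$ update and query time quoted above.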

The main difference between non-adaptive and memoryless update algorithms is that non-adaptive
update algorithms may move the information about an update operation around the
data structure, even on later updates. As an example, consider a data structure with a non-adaptive
update algorithm and two possible updates, say updates $u_1$ and $u_2$. Even if the data
structure only probes the first memory cell on update $u_1$, information about $u_1$ can be stored
in many other places in the data structure. Imagine the data structure initially stores the value
0 in the first memory cell. Whenever update $u_1$ is performed, the data structure increments the
contents of the first memory cell by one. On update $u_2$, the data structure copies the contents
of the first memory cell to the second memory cell. Clearly both operations are non-adaptive,
and we observe that whenever we have performed update $u_2$, the second memory cell stores
the number of times update $u_1$ has been performed, even though $u_1$ never probes the cell.
For memoryless updates, information about an update is only stored in cells that are actually
probed when processing the update operation.
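
The two-update example above can be replayed directly. This is a toy sketch with hypothetical function names `u1` and `u2`; it ignores the word size, so the counter never overflows.

```python
cells = [0, 0]      # cells[0] is the first memory cell, cells[1] the second


def u1(cells):
    # Probes only the first cell: increment its contents by one.
    cells[0] += 1


def u2(cells):
    # Probes the first and second cells: copy the first into the second.
    # The value written depends on another cell, so u2 is not memoryless.
    cells[1] = cells[0]


# The cells touched by u1 and u2 are fixed in advance, so both are non-adaptive.
for op in (u1, u1, u1, u2):
    op(cells)
```

After this sequence both cells hold 3: the second cell records how often `u1` ran, although `u1` never probed it.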

Linear data structures are inherently memoryless. However, some features possible with 
memoryless updates are not available to linear data structures. For example, memoryless update 
algorithms can support cells that maintain a count of the total number of updates executed. 
This is not possible with linear data structures, since the contents of each cell are a fixed linear
combination of the data being stored. 

1.4 Our Results 

The main result of this paper is to demonstrate that polynomial cell probe lower bounds can
be achieved when we restrict data structures to be non-adaptive. In Section 2 we also prove
lower bounds for data structures where only the query algorithm is non-adaptive. The concrete 
data structure problem that we study in this setting is the following indexing problem. 

Indexing Problem. In a preprocessing phase, we receive a set of k binary strings $S_1, \ldots, S_k$,
each of length n. We are then to support updates, consisting of an index $j \in [n]$, which we
think of as an index into the strings $S_1, \ldots, S_k$. A query is finally specified by an index $i \in [k]$
and the goal is to return the j'th bit of $S_i$.

Theorem 1. Any cell probe data structure solving the indexing problem with a non-adaptive
query algorithm must either have $t_q = \Omega(n/w)$ or $t_u = \Omega(k/w)$, regardless of the preprocessing
time and space usage.
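
To get a feel for the two sides of this trade-off, consider two simple solutions with non-adaptive queries. The sketch below is not from the paper: the class names are hypothetical, probes are counted arithmetically rather than through an explicit memory, and it assumes an index j fits in a single w-bit cell. `RowScan` pays about n/w probes per query, while `ColumnCopy` pays about k/w probes per update, which suggests that each side of the bound in Theorem 1 is attainable up to constant factors.

```python
from math import ceil


class RowScan:
    """Update stores j in one cell; query i reads that cell and all of S_i."""

    def __init__(self, strings, w):
        self.rows, self.w, self.j = strings, w, 0

    def update(self, j):
        self.j = j
        return 1                                        # probes: one cell written

    def query(self, i):
        # Probe set = {cell holding j} plus every cell of S_i: depends only on i.
        probes = 1 + ceil(len(self.rows[i]) / self.w)
        return self.rows[i][self.j], probes


class ColumnCopy:
    """Update j copies column j into an answer area; query i reads one cell."""

    def __init__(self, strings, w):
        self.rows, self.w = strings, w
        self.answer = [0] * len(strings)

    def update(self, j):
        self.answer = [row[j] for row in self.rows]
        return 2 * ceil(len(self.rows) / self.w)        # read column, write answers

    def query(self, i):
        return self.answer[i], 1                        # one fixed cell per query
```

Both methods return the number of probes alongside their result, so the two cost profiles can be compared on the same input.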

Examining this problem, one quickly observes that it is a special case of the multiphase 
problem presented in Section 1.2. Thus, by setting the parameters in the reductions of [17], [2]
correctly we obtain, amongst others, the following lower bounds as an immediate corollary of 
our lower bound for the indexing problem: 

Corollary 1. Any cell probe data structure that uses a non-adaptive query algorithm to solve 
(i) reachability in directed graphs or (ii) subgraph connectivity must either have $t_q = \Omega(n/w)$ or
$t_u = \Omega(n/w)$. Any cell probe data structure that solves range mode with a non-adaptive query



In Section 2 we prove lower bounds for data structures where the query algorithm is allowed
to be adaptive, but the update algorithm is memoryless. Again, we prove our lower bound for 
a special case of the multiphase problem: 

Set Disjointness Problem. In a preprocessing phase, we receive a subset S of a universe 
[n]. We are then to support inserting elements $x \in [n]$ into an initially empty set T. Finally a
query simply asks to return whether $S \cap T = \emptyset$, i.e. the problem has just one query.

Theorem 2. Any cell probe data structure solving the set disjointness problem with a memoryless
update algorithm must have $t_q = \Omega(n/w)$, regardless of the preprocessing time, space usage
and update time.

Again, using the reductions of [17], [2], we obtain the following lower bounds as a corollary
of our lower bound for the set disjointness problem: 

Corollary 2. Any cell probe data structure that uses a memoryless update algorithm to solve 
(i) reachability in directed graphs, (ii) subgraph connectivity, or (iii) range mode must have
$t_q = \Omega(n/w)$.

Finally, in Section 3 we show a strong connection between non-adaptive data structures
and the wire complexity of depth-2 circuits. In these circuits, gates have unbounded fan-in and 
fan-out and compute arbitrary functions. Thus, trivial bounds on the number of gates exist. 
Instead, the size of a circuit s(C) is defined to be the number of wires. 

Proving lower bounds on the size of circuits computing explicit operators $F : \{0,1\}^n \to \{0,1\}^m$
has been studied in several works. In particular, Valiant [20] showed that an $\omega(n^2 / \lg \lg n)$
bound for circuits computing F implies that F cannot be computed by log-depth, linear size,
bounded fan-in circuits. Currently, the best bounds known for an explicit operator are $\Omega(n^{3/2})$.
Cherukhin [6] gave such a bound for circuits computing cyclic convolutions. Jukna [10] gave
a similar lower bound for circuits computing matrix multiplication, and developed a general
technique for proving such lower bounds, formalizing the intuition in [6]. 

First, we show how to use simple encoding arguments common to data structure lower 
bounds to achieve circuit lower bounds, using matrix multiplication as an example. Our bound 
matches the result from [10], but yields a simpler argument. We discuss Jukna's technique in
more detail in Section 3.

Theorem 3 ([10]). Any circuit computing matrix multiplication has size at least $n^{3/2}$.

Depth-2 circuits computing explicit linear operators are of particular interest. Currently, 
the best lower bound for an explicit linear operator is the recent $\Omega(n (\lg n / \lg \lg n)^2)$ bound of
Gál et al. [8] for circuits that compute error correcting codes. Another interesting question is
whether general circuits are more powerful than linear circuits for computing linear operators. 
Linear circuits use only XOR gates; i.e., each gate outputs a linear combination in GF(2) over 
its inputs. 

We show a generic connection between linear data structures and linear circuits. Define a 
problem $\mathcal{P}$ as a mapping $F_{\mathcal{P}} = (f_1, \ldots, f_m) : \{0,1\}^n \to \{0,1\}^m$, where each $f_j : \{0,1\}^n \to \{0,1\}$.
For linear data structures, think of the domain $\{0,1\}^n$ as the input array A with n bits, and
view each $f_j$ as a query, where $f_j(A)$ is the answer to the query on the input A. A linear
data structure hence solves $\mathcal{P}$, if after any sequence of updates to A, it holds for all $1 \le j \le m$
that answering the query $f_j$ returns the value $f_j(A)$.
Lemma 1. If there is a linear data structure for a problem $\mathcal{P}$ with query time $t_q$ and update
time $t_u$, then there exists a depth-2 linear circuit C computing $F_{\mathcal{P}}$ with size $s(C) \le n t_u + m t_q$.

If there is a depth-2 linear circuit C that computes $F_{\mathcal{P}}$, then there is a linear data structure
for $\mathcal{P}$ with average query time at most $s(C)/m$ and average update time at most $s(C)/n$.
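
The first direction of the lemma can be illustrated with a small sketch. It assumes the natural construction suggested by the definition of linear data structures: one interior XOR gate per cell, a wire from input i to every cell flipped when bit i is flipped, and a wire from every cell probed by query j to output j. The function names are hypothetical, and the instance is the prefix XOR problem with Fenwick-tree probe sets, chosen only for illustration.

```python
def up_path(i, n):
    """Cells flipped when bit i (1-indexed) of the array is flipped."""
    out = []
    while i <= n:
        out.append(i)
        i += i & -i
    return out


def down_path(k):
    """Cells probed by the prefix query A[1] xor ... xor A[k]."""
    out = []
    while k > 0:
        out.append(k)
        k -= k & -k
    return out


def circuit_from_linear_ds(update_cells, query_cells):
    """Wires of the depth-2 XOR circuit built from the two families of probe sets."""
    in_wires = [(i, c) for i, cells in enumerate(update_cells) for c in cells]
    out_wires = [(c, j) for j, cells in enumerate(query_cells) for c in cells]
    return in_wires, out_wires


def evaluate(in_wires, out_wires, A, m):
    gate = {}
    for i, c in in_wires:
        gate[c] = gate.get(c, 0) ^ A[i]     # interior gate: XOR of its inputs
    out = [0] * m
    for c, j in out_wires:
        out[j] ^= gate.get(c, 0)            # output gate: XOR of interior gates
    return out


n = m = 8
update_cells = [up_path(i, n) for i in range(1, n + 1)]
query_cells = [down_path(k) for k in range(1, m + 1)]
in_wires, out_wires = circuit_from_linear_ds(update_cells, query_cells)
size = len(in_wires) + len(out_wires)       # s(C): total number of wires
t_u = max(map(len, update_cells))
t_q = max(map(len, query_cells))
```

Each input contributes at most $t_u$ wires and each output at most $t_q$ wires, which is where the bound $n t_u + m t_q$ comes from.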

Lemma 1 thus gives a new way to attack circuit lower bounds. We believe the connection
between non-adaptive data structures and depth-2 circuits has the potential to yield strong
insight into this problem, and that several linear operators conjectured to have strong data structure
lower bounds are good candidates for hard circuit problems (for linear or general circuits). 

Apart from being interesting lower bounds in their own right, we believe our results shed 
much light on the inherent difficulties of proving polynomial lower bounds in the cell probe 
model. In particular, the movement of data when performing updates (see the discussion in
Section 1.3) appears to be a major obstacle. We conclude in Section 3 with a discussion of our
results and potential directions for future research. 

2 Lower Bounds 

In this section, we first prove lower bounds for data structures where only the query algorithm 
is assumed non-adaptive. The problem we study is the indexing problem defined in Section 1.4.

Theorem 4 (Restatement of Theorem 1). Any cell probe data structure solving the indexing
problem with a non-adaptive query algorithm must either have $t_q = \Omega(n/w)$ or $t_u = \Omega(k/w)$,
regardless of the preprocessing time and space usage. Here $t_q$ denotes the query time, $t_u$ the
update time and w the cell size in bits.

We prove this using an encoding argument. Specifically, consider a game between an encoder 
and a decoder. The encoder receives as input k binary strings $S_1, \ldots, S_k$, each of length n
and must from this send a message to the decoder. From the message alone, the decoder
must uniquely recover all the strings $S_1, \ldots, S_k$. If the strings $S_1, \ldots, S_k$ are drawn from a
distribution, then the expected length of the message must be at least $H(S_1 \cdots S_k)$, or we have
reached a contradiction. Here $H(\cdot)$ denotes Shannon entropy.

The idea in our proof is to assume for contradiction that a data structure for the indexing 
problem exists with a non-adaptive query algorithm that simultaneously has $t_q = o(n/w)$ and
$t_u = o(k/w)$. Using this data structure as a black box, we construct a message that is shorter
than $H(S_1 \cdots S_k)$, but at the same time, the decoder can recover $S_1, \ldots, S_k$ from the message,
i.e. we have reached the contradiction. We let the k strings $S_1, \ldots, S_k$ given as input to the
encoder be uniform random bit strings of length n. Clearly $H(S_1 \cdots S_k) = kn$.

Encoding Procedure. When given the strings $S_1, \ldots, S_k$ as input, the encoder first runs the
preprocessing algorithm of the claimed data structure on $S_1, \ldots, S_k$. He then examines every
possible query index $i \in [k]$, and for each i, collects the set of addresses of the cells probed on
query i. Since the query algorithm is non-adaptive, these sets of addresses are independent of
$S_1, \ldots, S_k$ and any updates we might perform on the data structure. Letting C denote the set
containing all these addresses for all i, the encoder starts by writing down the concatenation of 
the contents of all cells with an address in C. This constitutes the first part of the message. 

The encoder now runs through every possible update $j \in [n]$. For each j, he runs the update
algorithm as if update j was performed on the data structure. While running update j, the
encoder appends the contents of the probed cells (as they are when the update reads the cells,
not after potential changes) to the constructed message. After processing all j's, the encoder 
finally sends the constructed message to the decoder. This completes the encoding procedure. 



Decoding Procedure. The decoder receives as input the message consisting first of the contents
of all cells with an address in C after preprocessing $S_1, \ldots, S_k$. Since the query algorithm
is non-adaptive, the decoder knows the addresses of all these cells simply by examining the
query algorithm of the claimed data structure. The decoder will now run the update algorithm
of every $j \in [n]$. While doing this, he maintains the contents of all cells in C and all cells probed
during the updates. Specifically, the decoder does the following: 

For each $j = 1, \ldots, n$ in turn, he starts to run the update algorithm for j. Observe that
the contents of each probed cell (before potential changes) can be recovered from the message
(the contents appear one after another in the message). This allows the decoder to completely
simulate the update algorithm for each $j = 1, \ldots, n$. Note furthermore that for each cell that is
probed during these updates, the address can also be recovered simply by examining the update
algorithm. In this way, the decoder always knows the contents of all cells in C and all cells
probed by the update algorithm as they would have been after preprocessing $S_1, \ldots, S_k$ and
performing the updates after this preprocessing. While processing the updates $j = 1, \ldots, n$, the
decoder also executes a number of queries: After having completely processed an update j, the
decoder runs the query algorithm for every $i \in [k]$. Note that the decoder knows the contents
of all the probed cells as if the preprocessing on $S_1, \ldots, S_k$ had been performed, followed by
updates $j' = 1, \ldots, j$. This implies that the simulation of the query algorithm for each $i \in [k]$
terminates precisely with the answer being the j'th bit of $S_i$. It follows immediately that the
decoder can recover every bit of every $S_i$ from the message.

Analysis. What remains is to analyze the size of the message. Since by assumption, the
query time is $t_q = o(n/w)$, the first part of the message has at most $t_q k w = o(kn)$ bits. Similarly, we
assumed $t_u = o(k/w)$, thus the second part of the message has at most $t_u n w = o(kn)$ bits. Thus the
entire message has $o(kn)$ bits. Since $H(S_1 \cdots S_k) = kn$, we have reached our contradiction.
This completes the proof of Theorem 1.
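
The mechanics of the encoding and decoding procedures can be simulated on a toy data structure. Everything below is an illustrative sketch with hypothetical names. The toy structure uses w = 1, keeps bit j of $S_i$ in a cell `("col", j, i)` and the current answer to query i in a cell `("ans", i)`, and an update copies a column into the answer cells. Only cells that an update reads are logged, since the toy update overwrites answer cells without looking at them. Its update probes k cells, so the message is not short and no contradiction arises; the point is only that the decoder recovers every string from the message alone.

```python
def preprocess(S):
    k, n = len(S), len(S[0])
    mem = {("col", j, i): S[i][j] for i in range(k) for j in range(n)}
    mem.update({("ans", i): 0 for i in range(k)})
    return mem


def query_cells(i):
    return [("ans", i)]                  # non-adaptive: depends only on i


def query_answer(i, contents):
    return contents[0]


def update(j, k, read, write):
    for i in range(k):                   # copy column j into the answer cells
        write(("ans", i), read(("col", j, i)))


def encode(S):
    k, n = len(S), len(S[0])
    mem = preprocess(S)
    C = [a for i in range(k) for a in query_cells(i)]
    part1 = [mem[a] for a in C]          # contents of C after preprocessing
    part2 = []

    def read(a):
        part2.append(mem[a])             # contents as seen when the update reads
        return mem[a]

    for j in range(n):
        update(j, k, read, mem.__setitem__)
    return part1 + part2


def decode(message, k, n):
    C = [a for i in range(k) for a in query_cells(i)]
    known = dict(zip(C, message[:len(C)]))
    rest = iter(message[len(C):])

    def read(a):
        known[a] = next(rest)            # recover the contents from the message
        return known[a]

    S = [[0] * n for _ in range(k)]
    for j in range(n):
        update(j, k, read, known.__setitem__)
        for i in range(k):               # after update j, query i yields bit j of S_i
            S[i][j] = query_answer(i, [known[a] for a in query_cells(i)])
    return S
```

For this toy structure the message has k + nk bits, matching the form $t_q k w + t_u n w$ from the analysis with $t_q = 1$, $t_u = k$ and $w = 1$.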

Next, we prove lower bounds for data structures where only the update algorithm is assumed to 
be memoryless, that is, we allow the query algorithm to be adaptive. In this setting, we study 
the set disjointness problem defined in Section 1.4.

Theorem 5 (Restatement of Theorem 2). Any cell probe data structure solving the set disjointness
problem with a memoryless update algorithm must have $t_q = \Omega(n/w)$, regardless of
the preprocessing time, space usage and update time. Here $t_q$ denotes the query time and w the
cell size in bits.

Again, we prove this using an encoding argument. In this encoding proof, we let the input 
of the encoder be a uniform random set $S \subseteq [n]$. Clearly $H(S) = n$ bits. We now assume
for contradiction that there exists a data structure for the set disjointness problem with a
memoryless update algorithm that at the same time has query time $t_q = o(n/w)$. The encoder uses
this data structure to send a message encoding S in fewer than n bits, i.e. a contradiction.

Encoding Procedure. When the encoder receives S, he runs the preprocessing algorithm of
the claimed data structure. Then, he computes $\bar{S} = [n] \setminus S$ and inserts $\bar{S}$ into the data structure
as the set T. Finally, the encoder runs the query algorithm and notes the set of cells C probed.
Note that by the choice of $\bar{S}$, the query algorithm will output disjoint, and furthermore, $\bar{S}$ is
the largest possible set that will result in a disjoint answer.

The encoding consists of three parts (see footnote 2): (i) the addresses of the cells in C, (ii) the contents
of the cells in C after preprocessing but before inserting $\bar{S}$, and (iii) the contents of the cells in
C after inserting $\bar{S}$.

2 In fact, it is possible for the decoder to recover C from the second two parts of the encoding, so the first part
is unnecessary. However, this does not materially affect our lower bound, so we omit the details.

Decoding Procedure. The decoder iterates over all sets $S' \subseteq [n]$. Each time, the decoder
initializes the contents of cells in C to match the second part of the encoder's message. Then,
he inserts each element of $S'$ into the data structure, changing the contents of any cell in C
where appropriate. When a cell outside of C is to be changed, the decoder does nothing. Since
the update algorithm is memoryless, this procedure ends with all cells in C having the same
contents as they would have had after preprocessing S and inserting elements of $S'$. Moreover,
if the contents match the contents written down in the third part of the encoding, then it must
be that S and $S'$ are disjoint (we know that the query answers disjoint when the contents of
C are like that). When $S' = \bar{S}$, the contents of C will match the last part of the encoding,
and it is trivially the largest set to do so. Thus, the decoder selects the largest set $S^*$ whose
updates to C match the contents written down in the third part of the encoding. In this way,
the decoder recovers $S = [n] \setminus S^*$.

Analysis. Finally, we analyze the size of the encoding. Since we assumed $t_q = o(n/w)$, the
encoding has size $3 t_q w = o(n)$ bits. But $H(S) = n$, thus we have reached a contradiction.
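
The procedure for the set disjointness problem can likewise be replayed on a toy structure. This is a sketch under stated assumptions: the names are hypothetical, cells hold single bits, elements are numbered from 0, and the adaptive query scans the elements of S and inspects the corresponding cell of T. The decoder tries candidate sets from largest to smallest, which takes exponential time; that is acceptable in an encoding argument, where only the message length matters.

```python
from itertools import combinations


def preprocess(S, n):
    mem = {("S", x): int(x in S) for x in range(n)}
    mem.update({("T", x): 0 for x in range(n)})
    return mem


def update_cell(x):
    return ("T", x)                      # non-adaptive: depends only on x


def memoryless_write(x, old):
    return 1                             # depends only on x and the cell's old contents


def query(mem, n):
    """Adaptive query; returns (disjoint?, addresses of the probed cells)."""
    probed = []
    for x in range(n):
        probed.append(("S", x))
        if mem[("S", x)]:
            probed.append(("T", x))
            if mem[("T", x)]:
                return False, probed
    return True, probed


def encode(S, n):
    mem = preprocess(S, n)
    before = dict(mem)
    for x in set(range(n)) - S:          # insert the complement of S as the set T
        a = update_cell(x)
        mem[a] = memoryless_write(x, mem[a])
    disjoint, C = query(mem, n)
    assert disjoint
    return C, [before[a] for a in C], [mem[a] for a in C]


def decode(message, n):
    C, before, after = message
    for size in range(n, -1, -1):        # largest candidate sets first
        for cand in combinations(range(n), size):
            cells = dict(zip(C, before))
            for x in cand:
                a = update_cell(x)
                if a in cells:           # changes to cells outside C are ignored
                    cells[a] = memoryless_write(x, cells[a])
            if [cells[a] for a in C] == after:
                return set(range(n)) - set(cand)
```

In this toy structure the query probes more than n bits, so the message is longer than n bits and nothing is contradicted; the sketch only demonstrates that the largest matching candidate is the complement of S.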

3 Circuits and Non-Adaptive Data Structures

In this section, we demonstrate a strong connection between non-adaptive data structures and 
the wire complexity of depth-2 circuits. A depth-2 circuit computing
$F = (f_1, \ldots, f_m) : \{0,1\}^n \to \{0,1\}^m$ is a directed graph with three layers of vertices. The first layer consists
of n input nodes, labeled $x_1, \ldots, x_n \in \{0,1\}$. Vertices in the second layer are interior gates and
output boolean values. The last layer consists of m output gates, labeled $z_1, \ldots, z_m \in \{0,1\}$.
There are edges between input nodes and interior gates, and between interior gates and output
gates. Each gate computes an arbitrary function of its inputs. Since non-input nodes compute
arbitrary functions, F can be trivially computed using m gates. Instead, we define the size $s(C)$
of a depth-2 circuit C as the total number of wires in it; i.e., the number of edges in the graph.

First, we show how to use the encoding technique common to data structure lower bounds 
to achieve size bounds for depth-2 circuits. As a proof of concept, we prove such a lower bound 
for matrix multiplication. We say that a circuit computes matrix multiplication if there are 
n = 2m inputs, each corresponding to an entry in one of two $\sqrt{n} \times \sqrt{n}$ binary matrices A and
B, and each output gate computes an entry in the product $A \cdot B$. Arithmetic is in GF(2).

Jukna [10] considered depth-2 circuits and gave an $n^{3/2}$ lower bound for circuits computing
boolean matrix multiplication. At a high level, his proof proceeds in the following fashion. 

1. Partition input nodes into sets $I_1, \ldots, I_t$ and output gates into sets $J_1, \ldots, J_t$.

2. Prove that for each $1 \le \ell \le t$, the number of wires leaving inputs from $I_\ell$ plus the number
of wires entering outputs in $J_\ell$ must be large.

3. Conclude a large lower bound by summing the terms from Step 2.

Note that since $\{I_\ell\}$ and $\{J_\ell\}$ are partitions, they induce a partition on the wires in the circuit.
Jukna proves Step 2 by proving lower bounds on what he calls the entropy of an operator. He
proves a lower bound on the entropy of an operator by carefully analyzing subfunctions of the
operator. In the case of matrix multiplication, subfunctions are created by fixing entries in B
to be all zero, except for a single cell $B[k, \ell]$. Each $I_\ell$, $J_\ell$ represents a column in B and in $A \cdot B$
respectively. By ranging over different $k, \ell$, Jukna is able to argue that the entropy of matrix
multiplication is high. The details of this argument are technical.

We give a new proof for Step 2 using an encoding argument. The encoder exploits the circuit
operations to encode a $\sqrt{n} \times \sqrt{n}$ matrix $A$. The encoded message has length precisely equal to
the number of outgoing wires in $I_\ell$ and incoming wires to $J_\ell$. The argument is very similar to
the arguments in Section 2; we leave it to the full version of the paper for lack of space. 

Theorem 6. Any circuit $C$ computing boolean matrix multiplication has size $s(C) \ge n^{3/2}$. 

Finally, we provide a strong connection between depth-2 linear circuits and linear data 
structures. The connection is almost immediately established: 

Lemma 2 (Restatement of Lemma 1). If there is a linear data structure for a problem $\mathcal{P}$ with
query time $t_q$ and update time $t_u$, then there exists a depth-2 linear circuit $C$ computing $F_{\mathcal{P}}$
with size $s(C) \le n t_u + m t_q$.

If there is a depth-2 linear circuit $C$ computing $F_{\mathcal{P}}$, then there is a linear data structure for
$\mathcal{P}$ with average query time at most $s(C)/m$ and average update time at most $s(C)/n$. 

Proof. First, suppose there exists a linear data structure solving $\mathcal{P}$. We construct the
corresponding depth-2 circuit directly. Input nodes correspond to the $n$ bits of the input (the array $A$
in the definition of linear data structures). Output nodes correspond to the $m$ possible queries,
and there is an interior node for each cell in the database. For each update $1 \le i \le n$ (flip an
entry of $A$), add edges from $x_i$ to each of the cells updated by the data structure. Similarly,
add wires $(c_i, z_j)$ whenever the $j$th query probes the $i$th cell in the data structure. Correctness
follows immediately. Finally, note that since updates and queries probe at most $t_u$ and $t_q$ cells
respectively, the total number of wires in the circuit is bounded by $s(C) \le n t_u + m t_q$. 

Constructing a linear data structure from a linear depth-2 circuit $C$ is similar. Letting $t_{u,i}$
and $t_{q,j}$ denote the number of cells probed during the $i$th update and $j$th query respectively, it is
easy to see that $s(C) = \sum_{i=1}^{n} t_{u,i} + \sum_{j=1}^{m} t_{q,j}$. It follows that the average update time is at most
$\frac{1}{n} \sum_{i=1}^{n} t_{u,i} \le s(C)/n$, and similarly that the average query time is at most $\frac{1}{m} \sum_{j=1}^{m} t_{q,j} \le s(C)/m$. □ 
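
The first direction of the proof can be sketched in a few lines. The example below is an illustration under stated assumptions: the function names, the representation of a linear data structure as two lists of probed cells, and the tiny prefix-parity structure are all hypothetical and chosen only to make the wire count concrete.

```python
def circuit_from_linear_ds(update_cells, query_cells):
    """Build the wires of a depth-2 linear circuit from a linear data structure.

    update_cells[i]: cells flipped when input bit i is flipped.
    query_cells[j]:  cells XORed together to answer query j.
    """
    wires_in = [(i, c) for i, cells in enumerate(update_cells) for c in cells]
    wires_out = [(c, j) for j, cells in enumerate(query_cells) for c in cells]
    return wires_in, wires_out


def evaluate(wires_in, wires_out, x, num_cells, m):
    """Evaluate the circuit over GF(2): every gate is an XOR of its inputs."""
    cell = [0] * num_cells
    for i, c in wires_in:
        cell[c] ^= x[i]
    z = [0] * m
    for c, j in wires_out:
        z[j] ^= cell[c]
    return z


# Hypothetical structure for prefix parity on n = 4 bits with 5 cells:
# cells 0..3 store x0..x3, cell 4 stores x0 xor x1.
update_cells = [(0, 4), (1, 4), (2,), (3,)]      # t_u = 2
query_cells = [(0,), (4,), (4, 2), (4, 2, 3)]    # t_q = 3

wires_in, wires_out = circuit_from_linear_ds(update_cells, query_cells)
n, m, t_u, t_q = 4, 4, 2, 3
print(len(wires_in) + len(wires_out), "<=", n * t_u + m * t_q)
```

Here the circuit has 13 wires, below the worst-case bound $n t_u + m t_q = 20$, because the bound charges every update and query its maximum probe count.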

The main contribution of Lemma 2 is a new range of candidate hard problems for linear
circuits, all inspired by data structure problems. As mentioned in Section 1.3, linear data
structures most naturally occur in the field of range searching. Furthermore, these data structure
problems turn out to correspond precisely to linear operators: Let $P = \{p_1, \ldots, p_n\}$ be a fixed
set of $n$ points in $\mathbb{R}^d$, and let $\mathcal{R}$ be a set of query ranges, where each $R_i \in \mathcal{R}$ is a subset of
$\mathbb{R}^d$. $P$ and $\mathcal{R}$ naturally define a linear operator $A(P, \mathcal{R}) \in \{0,1\}^{|\mathcal{R}| \times |P|}$, where the $i$th row of
$A(P, \mathcal{R})$ has a 1 in the $j$th column if $p_j \in R_i$ and 0 otherwise. In the light of Lemma 2, assume
a linear data structure solves the following range counting problem: Given the fixed set of
points $P$, each assigned a weight in $\{0, 1\}$, support flipping the weights of the points (intuitively
inserting/deleting the points) while also supporting efficient computation of the parity of the
weights assigned to the points inside a query range $R_i \in \mathcal{R}$. Then that linear data structure
immediately translates into a linear circuit for the linear operator $A(P, \mathcal{R})$ and vice versa.
Thus we expect that hard range searching problems of the above form also provide hard linear
operators for linear circuits. The seemingly hardest range searching problem is simplex range
searching, where we believe that the following holds: 

Conjecture 1. There exists a constant $\varepsilon > 0$, a set $\mathcal{R}$ of $\Theta(n)$ simplices in $\mathbb{R}^d$ and a set
$P$ of $n$ points in $\mathbb{R}^d$, such that any data structure solving the above range counting problem (flip
weights, parity queries), must have average query and update time $t_u t_q = \Omega(n^{\varepsilon})$.

We have toned down Conjecture 1 somewhat, since the community generally believes $\varepsilon$ can
be replaced by $1 - 1/d$, but to be on the safe side we only conjecture the above. In the circuit
setting, this conjecture translates to 

Corollary 3. If Conjecture 1 is true for linear data structures, then there exists a constant
$\delta > 0$, a set $\mathcal{R}$ of $\Theta(n)$ simplices in $\mathbb{R}^d$ and a set $P$ of $n$ points, such that any linear circuit
computing the linear operator $A(P, \mathcal{R})$ must have $\Omega(n^{1+\delta})$ wires. 
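
To make the operator $A(P, \mathcal{R})$ from this section concrete, the following sketch builds the 0/1 incidence matrix from a point set and a list of membership predicates, and answers all parity queries as a matrix-vector product over GF(2). The function names, the four points and the two halfplane ranges are assumptions chosen for illustration; halfplanes stand in for simplices only to keep the predicates short.

```python
def range_operator(points, ranges):
    """Row i has a 1 in column j iff points[j] lies in ranges[i]."""
    return [[1 if contains(p) else 0 for p in points] for contains in ranges]


def parity_queries(A, weights):
    """Answer every range query at once: A * weights over GF(2)."""
    return [sum(a & w for a, w in zip(row, weights)) % 2 for row in A]


# Hypothetical instance in the plane with integer coordinates.
points = [(0, 0), (1, 0), (0, 1), (2, 2)]
ranges = [
    lambda p: p[0] + p[1] <= 1,   # halfplane x + y <= 1
    lambda p: p[0] >= 1,          # halfplane x >= 1
]

A = range_operator(points, ranges)
weights = [1, 0, 0, 1]
print(A)
print(parity_queries(A, weights))

# Flipping the weight of point 1 changes exactly the queries whose range contains it.
weights[1] ^= 1
print(parity_queries(A, weights))
```

An update touches one column of the matrix and a query reads one row, which is the correspondence between data structure operations and circuit wires used in Lemma 2.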

Furthermore, the research on data structure lower bounds also provides considerable insight into
which concrete sets $P$ and $\mathcal{R}$ might be difficult. More specifically, polynomial lower bounds
for simplex range searching have been proved for: range reporting in the pointer machine [5, 1]
and I/O-model pQ, range searching in the semigroup model [3] and range searching in the
group model (TTJ Q3]. The group model comes closest in spirit to linear data structures. A
data structure in the group model is essentially a linear data structure, where instead of storing
linear combinations over GF(2), we store linear combinations with integer coefficients (and no
mod operations). Similarly, queries are answered by computing linear combinations over the
stored elements, but with integer coefficients and not over GF(2). The properties used to drive
home range searching lower bounds in the group model are: 

• If $A(P, \mathcal{R})$ has polynomial red-blue discrepancy, then any group model data structure
must have $t_u t_q = \Omega(n^{\varepsilon})$ for some constant $\varepsilon > 0$.

• If $A(P, \mathcal{R})$ has $\Omega(n)$ eigenvalues that are polynomial, then any group model data structure
must have $t_u t_q = \Omega(n^{\varepsilon})$ for some constant $\varepsilon > 0$.

• If $|R_i \cap P|$ is polynomial for all $R_i \in \mathcal{R}$ and $|R_i \cap R_j \cap P| = O(1)$ for all $i \ne j$, then any
group model data structure must have $t_u t_q = \Omega(n^{\varepsilon})$ for some constant $\varepsilon > 0$. 

The last property directly translates to $A(P, \mathcal{R})$ having rows and columns with polynomially
many 1s and any two rows/columns having a constant number of 1s in common. Given the
tight correspondence between group model data structures and linear data structures, we believe
these properties are worth investigating in the circuit setting. Furthermore, a concrete set of $n$
points $P$ and a set of $\Theta(n)$ simplices $\mathcal{R}$, with all three properties, is known even in $\mathbb{R}^2$. This
example can be found in [2], where it is stated for $\mathcal{R}$ being lines (i.e. degenerate simplices).
Note that the lower bound in [4] is for range reporting in the pointer machine, but using the
observations in [11\ HH] it is easily seen that all the above properties hold. 
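
The third property is easy to test on a given 0/1 matrix. The sketch below checks it on a toy incidence structure: the points of a $p \times p$ grid and the lines $y = ax + b \pmod p$ for a prime $p$. This is an assumption made purely for illustration; it is a finite-plane analogue and not the construction from the cited work, and the helper names are hypothetical.

```python
from itertools import combinations


def incidence_matrix(p):
    """Rows: lines y = a*x + b (mod p). Columns: grid points (x, y). p is prime."""
    points = [(x, y) for x in range(p) for y in range(p)]
    lines = [(a, b) for a in range(p) for b in range(p)]
    return [[1 if (a * x + b) % p == y else 0 for (x, y) in points]
            for (a, b) in lines]


def min_weight(vectors):
    """Smallest number of 1s in any row."""
    return min(sum(v) for v in vectors)


def max_pairwise_overlap(vectors):
    """Largest number of 1s shared by two distinct rows."""
    return max(sum(u & v for u, v in zip(r1, r2))
               for r1, r2 in combinations(vectors, 2))


p = 5
A = incidence_matrix(p)                     # n = p*p points, p*p lines
columns = [list(col) for col in zip(*A)]

print(min_weight(A), max_pairwise_overlap(A))
print(min_weight(columns), max_pairwise_overlap(columns))
```

With $n = p^2$ points, every row and every column has $p = \sqrt{n}$ ones, while any two rows and any two columns share at most one, which is the shape of matrix the third property describes.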

Even if these properties are not enough to obtain lower bounds for linear operators, we 
believe the geometric approach might be useful in its own right. 

4 Conclusion 

In this paper, we have studied the role non-adaptivity plays in dynamic data structures. Surpris- 
ingly, we were able to prove polynomially high lower bounds for such data structures. Perhaps 
more importantly, we believe our results shed much new light on the current polylogarithmic 
barriers if we do not make any restrictions on data structures. We also presented an interesting 
connection between data structures and depth-2 circuits. The connection between linear op- 
erators and range searching is particularly intriguing, revealing a number of new properties to 
investigate further in the realm of circuit lower bounds. 

A A Lower Bound Proof for Matrix Multiplication 

Theorem 7 (Restatement of Theorem 6). Any circuit $C$ computing boolean matrix multiplication
has size $s(C) \ge n^{3/2}$. 

Proof. Fix a circuit $C$. Let $P = A \cdot B$. For $1 \le \ell \le \sqrt{n}$, let $I_\ell$ denote the $\ell$th column of $B$; that
is, $I_\ell$ consists of all inputs corresponding to $B[k, \ell]$ for some $k$. Similarly, $J_\ell$ is the set of all
outputs corresponding to the $\ell$th column of $P$; that is, all outputs given by $P[k, \ell]$ for some $k$.
Let $t_{u,\ell}$ denote the number of wires leaving inputs in $I_\ell$. Similarly, let $t_{q,\ell}$ denote the number
of wires entering outputs in $J_\ell$. 

Claim 1. For any $\ell$, we have $t_{u,\ell} + t_{q,\ell} \ge n$. 

Before proving this claim, note that Theorem 6 follows directly, since there are $\sqrt{n}$ pairs
$(I_\ell, J_\ell)$ and the wires corresponding to each pair are disjoint. □ 
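
Spelled out, the summation behind this step is as follows, writing $t_{u,\ell}$ for the number of wires leaving inputs in the $\ell$th column of $B$ and $t_{q,\ell}$ for the number of wires entering outputs in the $\ell$th column of $P$. The first inequality uses that the wire sets counted for different $\ell$ are disjoint, and the second applies Claim 1 to each of the $\sqrt{n}$ columns.

```latex
s(C) \;\ge\; \sum_{\ell=1}^{\sqrt{n}} \bigl( t_{u,\ell} + t_{q,\ell} \bigr)
     \;\ge\; \sum_{\ell=1}^{\sqrt{n}} n
     \;=\; \sqrt{n} \cdot n
     \;=\; n^{3/2}.
```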

Proof of Claim 1. This proof will involve an encoding argument. The encoder will receive a
$\sqrt{n} \times \sqrt{n}$ boolean matrix $M$, where $M$ is drawn uniformly amongst all such boolean matrices.
He will then use the matrix multiplication circuit to encode $M$ in such a way that the size of
the encoding depends on the wires leaving $I_\ell$ and entering $J_\ell$. 

Encoding Procedure. The encoder receives $M$. As a first step, he sets $A[i, j] \leftarrow M[i, j]$ for
all $i, j$; he also sets all entries in $B$ to zero. He then writes down the output of all interior gates
adjacent to an output in $J_\ell$. In the second step, for each $1 \le k \le \sqrt{n}$, the encoder performs the
following: he sets $B[k, \ell] \leftarrow 1$ and sets all other entries in $B$ to zero. He then writes down the
output of all interior gates adjacent to $B[k, \ell]$. This completes the encoding procedure. 

Decoding Procedure. Note that $P[i, \ell] = \sum_{j} A[i, j] B[j, \ell]$. In particular, when $B$ consists
of a 1 in entry $[k, \ell]$ and zero in all other entries, then the $\ell$th column of $P$ corresponds to the
$k$th column of $A$. The decoder thus recovers the $k$th column of $M$ by using $C$ to compute the
$\ell$th column of $P$, i.e., by querying all outputs in $J_\ell$. For each output gate in $J_\ell$, she looks at
all interior gates adjacent to it. For each of these gates $g$, the decoder checks to see if $g$ is
adjacent to the input gate $B[k, \ell]$. If so, then she recovers the correct output value of this gate
from the second part of the encoding. Otherwise, she recovers the correct output from the first
part (noting that in this case, changing the value of $B[k, \ell]$ does not affect $g$). In this way, the
decoder recovers the $\ell$th column of $P$, which is also the $k$th column of $A$, which is again the
$k$th column of $M$. Doing this for all $k$ completes the decoding. 
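
The algebraic fact the decoder relies on can be checked directly. The sketch below does not simulate the circuit or the encoding; it only verifies, under that simplification, that setting $B$ to a single 1 in entry $[k, \ell]$ makes the $\ell$th column of $A \cdot B$ equal to the $k$th column of $A$ over GF(2), so that ranging over $k$ recovers all of $M$. The function names are hypothetical.

```python
import random


def matmul_gf2(A, B):
    """Product of two square 0/1 matrices with arithmetic in GF(2)."""
    r = len(A)
    return [[sum(A[i][j] & B[j][l] for j in range(r)) % 2 for l in range(r)]
            for i in range(r)]


def recover_via_unit_columns(M, l):
    """Rebuild M column by column by reading only column l of M * B."""
    r = len(M)
    recovered = [[0] * r for _ in range(r)]
    for k in range(r):
        B = [[0] * r for _ in range(r)]
        B[k][l] = 1                      # single 1 in entry [k, l]
        P = matmul_gf2(M, B)
        for i in range(r):
            recovered[i][k] = P[i][l]    # column l of P is column k of M
    return recovered


random.seed(0)
r = 4
M = [[random.randint(0, 1) for _ in range(r)] for _ in range(r)]
print(recover_via_unit_columns(M, l=2) == M)
```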

Analysis. The first part of the encoding consists of the output of each interior gate adjacent
to at least one output in $J_\ell$. Thus, the first part of the encoding can be described in at most
$t_{q,\ell}$ bits. The second part of the encoding consists of the output of each interior gate adjacent
to each input node in $I_\ell$. This requires at most $t_{u,\ell}$ bits. Thus, the total length of the encoding
is at most $t_{u,\ell} + t_{q,\ell}$. The decoder recovers all of $M$ from this message. Since each entry of $M$
is independent and uniform, $H(M) = n$. Thus, $t_{u,\ell} + t_{q,\ell} \ge n$. □ 

Remark. As mentioned previously, Jukna proves his lower bounds by defining the entropy of 
an operator. He lower bounds the wire complexity of a circuit by the entropy of the operator 
it computes. He proves a lower bound on the entropy of an operator by carefully analyzing 
subfunctions of the operator, created by fixing subsets of the variables to specific values and 
considering the induced function on the remaining variables. 

Parts of Jukna's proof are similar in spirit to ours. In particular, the way we encode M by 
fixing the matrix $B$ to be one in entry $[k, \ell]$ and zero elsewhere corresponds to the subfunctions 
Jukna considers in his proof. In fact, we believe that any lower bound provable using Jukna's 
technique can also be proved using our method. Our advantage is in replacing Jukna's technical 
and somewhat complicated machinery with a simple encoding argument. 